 manipulation strategy


DexCanvas: Bridging Human Demonstrations and Robot Learning for Dexterous Manipulation

Xu, Xinyue, Sun, Jieqiang, Jing, null, Dai, null, Chen, Siyuan, Ma, Lanjie, Sun, Ke, Zhao, Bin, Yuan, Jianbo, Yi, Sheng, Zhu, Haohua, Lu, Yiwen

arXiv.org Artificial Intelligence

We present DexCanvas, a large-scale hybrid real-synthetic human manipulation dataset containing 7,000 hours of dexterous hand-object interactions seeded from 70 hours of real human demonstrations, organized across 21 fundamental manipulation types based on the Cutkosky taxonomy (Feix et al., 2016). Each entry combines synchronized multi-view RGB-D, high-precision mocap with MANO hand parameters, and per-frame contact points with physically consistent force profiles. Our real-to-sim pipeline uses reinforcement learning to train policies that control an actuated MANO hand in physics simulation, reproducing human demonstrations while discovering the underlying contact forces that generate the observed object motion. DexCanvas is the first manipulation dataset to combine large-scale real demonstrations, systematic skill coverage based on established taxonomies, and physics-validated contact annotations. The dataset can facilitate research in robotic manipulation learning, contact-rich control, and skill transfer across different hand morphologies. Dexterous manipulation with high-DoF anthropomorphic hands is fundamental to robot learning: it enables the most general form of object interaction and is essential for robots to achieve human-level autonomy in unstructured environments (Yu & Wang, 2022; Ozawa & Tahara, 2017). The field has witnessed rapid advancement along two dimensions: diverse learning paradigms including reinforcement learning for contact-rich control (Chen et al., 2024; 2023) and diffusion-based methods for handling multimodal action distributions (Weng et al., 2024; Wu et al., 2024), alongside dramatic scale expansion from task-specific models to billion-parameter foundation models (Wen et al., 2025; Kim et al., 2024; Zitkovich et al., 2023). However, current flagship manipulation systems predominantly rely on parallel-jaw grippers, while generalizable control of anthropomorphic hands remains limited to simulation or narrow real-world scenarios.
This gap highlights an opportunity: to unlock the full potential of dexterous manipulation, we need large-scale datasets that capture diverse human manipulation strategies with physically accurate contact dynamics and force profiles, the crucial signals for learning robust dexterous control. Building such datasets requires careful consideration of data sources and collection methodologies. The choice between robot-generated and human-sourced data presents fundamental tradeoffs for learning manipulation.


Flexible and Foldable: Workspace Analysis and Object Manipulation Using a Soft, Interconnected, Origami-Inspired Actuator Array

Dacre, Bailey, Moreno, Rodrigo, Demirtas, Serhat, Wang, Ziqiao, Jiang, Yuhao, Paik, Jamie, Stoy, Kasper, Faíña, Andrés

arXiv.org Artificial Intelligence

Object manipulation is a fundamental challenge in robotics, where systems must balance trade-offs among manipulation capabilities, system complexity, and throughput. Distributed manipulator systems (DMS) use the coordinated motion of actuator arrays to perform complex object manipulation tasks and have been widely explored in the literature and in industry. However, existing DMS designs typically rely on high actuator densities and impose constraints on object-to-actuator scale ratios, limiting their adaptability. We present a novel DMS design utilizing an array of 3-DoF, origami-inspired robotic tiles interconnected by a compliant surface layer. Unlike conventional DMS, our approach enables manipulation not only at the actuator end effectors but also across a flexible surface connecting all actuators, creating a continuous, controllable manipulation surface. We analyse the combined workspace of such a system, derive simple motion primitives, and demonstrate its capability to translate simple geometric objects across an array of tiles. By leveraging the inter-tile connective material, our approach significantly reduces actuator density, increasing the area over which an object can be manipulated by a factor of 1.84 without an increase in the number of actuators. This design offers a lower-cost, lower-complexity alternative to traditional high-density arrays, and introduces new opportunities for manipulation strategies that leverage the flexibility of the interconnected surface.


Surface-Based Manipulation

Wang, Ziqiao, Demirtas, Serhat, Zuliani, Fabio, Paik, Jamie

arXiv.org Artificial Intelligence

Intelligence lies not only in the brain but in the body. The shape of our bodies can influence how we think and interact with the physical world. In robotics research, interacting with the physical world is crucial as it allows robots to manipulate objects in various real-life scenarios. Conventional robotic manipulation strategies mainly rely on finger-shaped end effectors. However, achieving stable grasps on fragile, deformable, irregularly shaped, or slippery objects is challenging due to difficulties in establishing stable force or geometric constraints. Here, we present surface-based manipulation strategies that diverge from classical grasping approaches, using flat surfaces as minimalist end-effectors. By changing the position and orientation of these surfaces, objects can be translated, rotated and even flipped across the surface using closed-loop control strategies. Since this method does not rely on a stable grasp, it can adapt to objects of various shapes, sizes, and stiffness levels, even enabling manipulation of the shape of deformable objects. Our results provide a new perspective for solving complex manipulation problems.

  Country: Europe > Switzerland (0.14)
  Genre: Research Report > New Finding (0.87)

One-Shot Manipulation Strategy Learning by Making Contact Analogies

Liu, Yuyao, Mao, Jiayuan, Tenenbaum, Joshua, Lozano-Pérez, Tomás, Kaelbling, Leslie Pack

arXiv.org Artificial Intelligence

We present a novel approach, MAGIC (manipulation analogies for generalizable intelligent contacts), for one-shot learning of manipulation strategies with fast and extensive generalization to novel objects. By leveraging a reference action trajectory, MAGIC effectively identifies similar contact points and sequences of actions on novel objects to replicate a demonstrated strategy, such as using different hooks to retrieve distant objects of different shapes and sizes. Our method is based on a two-stage contact-point matching process that combines global shape matching using pretrained neural features with local curvature analysis to ensure precise and physically plausible contact points. We experiment with three tasks including scooping, hanging, and hooking objects. MAGIC demonstrates superior performance over existing methods, achieving significant improvements in runtime speed and generalization to different object categories. Website: https://magic-2024.github.io/ .


Adaptive Manipulation using Behavior Trees

Cloete, Jacques, Merkt, Wolfgang, Havoutis, Ioannis

arXiv.org Artificial Intelligence

Many manipulation tasks use instances of a set of common motions, such as a twisting motion for tightening or loosening a valve. However, different instances of the same motion often require different environmental parameters (e.g. force/torque level), and thus different manipulation strategies to successfully complete; for example, grasping a valve handle from the side rather than head-on to increase applied torque. Humans can intuitively adapt their manipulation strategy to best suit such problems, but representing and implementing such behaviors for robots remains an open question. We present a behavior tree-based approach for adaptive manipulation, wherein the robot can reactively select from and switch between a discrete set of manipulation strategies during task execution. Furthermore, our approach allows the robot to learn from past attempts to optimize performance, for example learning the optimal strategy for different task instances. Our approach also allows the robot to preempt task failure and either change to a more feasible strategy or safely exit the task before catastrophic failure occurs. We propose a simple behavior tree design for general adaptive robot behavior and apply it in the context of industrial manipulation. The adaptive behavior outperformed all baseline behaviors that only used a single manipulation strategy, markedly reducing the number of attempts and overall time taken to complete the example tasks. Our results demonstrate potential for improved robustness and efficiency in task completion, reducing dependency on human supervision and intervention.
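As a rough illustration of the idea in the abstract above, the following is a minimal sketch of a fallback-style behavior tree node that tries a discrete set of manipulation strategies and reorders them by past success. It is not the authors' design: the names `Strategy` and `AdaptiveFallback`, the torque thresholds, and the simple success-rate score are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

SUCCESS, FAILURE = "success", "failure"


@dataclass
class Strategy:
    """A hypothetical manipulation strategy (e.g. a grasp approach)."""
    name: str
    # Returns True if the strategy completes the task at the required torque.
    attempt: Callable[[float], bool]


@dataclass
class AdaptiveFallback:
    """Fallback node: tries strategies in order of observed success rate."""
    strategies: List[Strategy]
    wins: Dict[str, int] = field(default_factory=dict)
    tries: Dict[str, int] = field(default_factory=dict)

    def _score(self, s: Strategy) -> float:
        # Untried strategies get a neutral prior of 0.5 (an assumption).
        t = self.tries.get(s.name, 0)
        return self.wins.get(s.name, 0) / t if t else 0.5

    def tick(self, required_torque: float) -> Tuple[str, List[str]]:
        attempted: List[str] = []
        # sorted() is stable, so ties keep the original strategy order.
        for s in sorted(self.strategies, key=self._score, reverse=True):
            attempted.append(s.name)
            self.tries[s.name] = self.tries.get(s.name, 0) + 1
            if s.attempt(required_torque):
                self.wins[s.name] = self.wins.get(s.name, 0) + 1
                return SUCCESS, attempted
        return FAILURE, attempted


# Toy valve example: a side grasp can apply more torque than a head-on grasp.
head_on = Strategy("head_on", lambda torque: torque <= 5.0)
side = Strategy("side", lambda torque: torque <= 12.0)
node = AdaptiveFallback([head_on, side])

first = node.tick(10.0)   # tries head_on, fails, then switches to side
second = node.tick(10.0)  # side is now ranked first
```

In this toy setup the second call tries the side grasp first, which mirrors the abstract's claim that adapting reduces the number of attempts. A fuller version would condition the ranking on the task instance and could abort early before a catastrophic failure.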


Decentralized Online Learning in General-Sum Stackelberg Games

Yu, Yaolong, Chen, Haipeng

arXiv.org Artificial Intelligence

We study an online learning problem in general-sum Stackelberg games, where players act in a decentralized and strategic manner. We study two settings depending on the type of information for the follower: (1) the limited information setting where the follower only observes its own reward, and (2) the side information setting where the follower has extra side information about the leader's reward. We show that for the follower, myopically best responding to the leader's action is the best strategy for the limited information setting, but not necessarily so for the side information setting -- the follower can manipulate the leader's reward signals with strategic actions, and hence induce the leader's strategy to converge to an equilibrium that is better for itself. Based on these insights, we study decentralized online learning for both players in the two settings. Our main contribution is to derive last-iterate convergence and sample complexity results in both settings. Notably, we design a new manipulation strategy for the follower in the latter setting, and show that it has an intrinsic advantage against the best response strategy. Our theories are also supported by empirical results.
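To make the notion of a myopic best response concrete, here is a small sketch of a pure-strategy Stackelberg game with known rewards. It shows only the textbook leader-commits, follower-best-responds computation, not the paper's online learning or manipulation algorithm. The payoff matrices and the function names `best_response` and `stackelberg` are invented for this example.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def best_response(follower_reward: Matrix, leader_action: int) -> int:
    """Follower action maximizing its own reward given the leader's action."""
    row = follower_reward[leader_action]
    return max(range(len(row)), key=lambda b: row[b])


def stackelberg(leader_reward: Matrix,
                follower_reward: Matrix) -> Tuple[int, int, float]:
    """Leader picks the action whose induced best response pays it most."""
    def leader_value(a: int) -> float:
        return leader_reward[a][best_response(follower_reward, a)]

    a_star = max(range(len(leader_reward)), key=leader_value)
    b_star = best_response(follower_reward, a_star)
    return a_star, b_star, leader_reward[a_star][b_star]


# Rows index leader actions, columns index follower actions (toy numbers).
leader_reward = [[2.0, 4.0],
                 [1.0, 3.0]]
follower_reward = [[1.0, 0.0],
                   [0.0, 2.0]]

solution = stackelberg(leader_reward, follower_reward)  # (1, 1, 3.0)
```

In this toy game the leader commits to action 1 because the follower's best response to it (action 1) yields the leader a reward of 3, compared with 2 under action 0. The abstract's side information setting concerns a follower who deviates from `best_response` so that a learning leader converges to a different outcome; that behavior is not modeled here.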


What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection

Feng, Shangbin, Wan, Herun, Wang, Ningnan, Tan, Zhaoxuan, Luo, Minnan, Tsvetkov, Yulia

arXiv.org Artificial Intelligence

Social media bot detection has always been an arms race between advancements in machine learning bot detectors and adversarial bot strategies to evade detection. In this work, we bring the arms race to the next level by investigating the opportunities and risks of state-of-the-art large language models (LLMs) in social bot detection. To investigate the opportunities, we design novel LLM-based bot detectors by proposing a mixture-of-heterogeneous-experts framework to divide and conquer diverse user information modalities. To illuminate the risks, we explore the possibility of LLM-guided manipulation of user textual and structured information to evade detection. Extensive experiments with three LLMs on two datasets demonstrate that instruction tuning on merely 1,000 annotated examples produces specialized LLMs that outperform state-of-the-art baselines by up to 9.1% on both datasets, while LLM-guided manipulation strategies could significantly bring down the performance of existing bot detectors by up to 29.6% and harm the calibration and reliability of bot detection systems.


The Multi-fingered Kinematic Model for Dual-arm Manipulation

Li, Jingyi

arXiv.org Artificial Intelligence

A planar kinematic model in the hand-object coordinate system for bimanual manipulation is presented. It can compute and determine the finger configurations. In our experiment, the desired positions given as model inputs successfully yield valid joint values for bimanual manipulation. This paper presents a planar finger kinematic model for a dual-arm robot to determine manipulation strategies. The first step is to build a model based on planar geometric features of the coordinated and rolling motion so that the robot can select the finger configurations. For the hand-object model, we consider the distances between the object and the hands as constraints. The second step is to seek appropriate values of the finger joints based on randomly generated position samples. Here the robot selects these positions according to the displacements of each joint and k-means clustering. The simulation shows that the selected solutions for the manipulation all lie within the finger workspace.


The Hand-object Kinematic Model for Bimanual Manipulation

Li, Jingyi

arXiv.org Artificial Intelligence

This paper addresses planar finger kinematics for seeking optimized manipulation strategies. The first step is to build a model based on geometric features of linear and rotational motion so that the robot can select the finger configurations. This kinematic model considers the motion between the hands and the object. Based on 2-finger manipulation cases, this model can output the strategies for bimanual manipulation. To execute the strategies, the second step is to seek appropriate values of the finger joints according to the final orientation of the fingers. The simulation shows that the computed solutions can complete the relative rotation and linear motion of unknown objects.


Linear Delta Arrays for Compliant Dexterous Distributed Manipulation

Patil, Sarvesh, Tao, Tony, Hellebrekers, Tess, Kroemer, Oliver, Temel, F. Zeynep

arXiv.org Artificial Intelligence

This paper presents a new type of distributed dexterous manipulator: delta arrays. Our delta array setup consists of 64 linearly-actuated delta robots with 3D-printed compliant linkages. Through the design of the individual delta robots, the modular array structure, and distributed communication and control, we study a wide range of in-plane and out-of-plane manipulations, as well as prehensile manipulations among subsets of neighboring delta robots. We also demonstrate dexterous manipulation capabilities of the delta array using reinforcement learning while leveraging the compliance to not break the end-effectors. Our evaluations show that the resulting 192 DoF compliant robot is capable of performing various coordinated distributed manipulations of a variety of objects, including translation, alignment, prehensile squeezing, lifting, and grasping.